Zero-shot cross-lingual transfer language selection using linguistic similarity
Abstract
We study the selection of transfer languages for different Natural Language Processing tasks, specifically sentiment analysis, named entity recognition and dependency parsing. In order to select an optimal transfer language, we propose to utilize linguistic similarity metrics to measure the distance between languages and make the choice of transfer language based on this information instead of relying on intuition. We demonstrate that linguistic similarity correlates with cross-lingual transfer performance for all of the proposed tasks. We also show that there is a statistically significant difference in choosing the optimal language as the transfer source instead of English. This allows us to select a more suitable transfer language, which can be used to better leverage knowledge from high-resource languages in order to improve the performance of language applications lacking data. For our study, we used datasets from eight different languages from three language families.
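The selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the similarity metrics used, so cosine distance over binary typological feature vectors is an assumption, and the language names, feature values, and the helper `select_transfer_language` are hypothetical placeholders.

```python
from math import sqrt


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def select_transfer_language(target, candidates, features):
    """Pick the candidate whose feature vector is closest to the target's.

    Hypothetical helper: `features` maps a language name to its vector.
    """
    return min(
        candidates,
        key=lambda lang: cosine_distance(features[target], features[lang]),
    )


# Toy feature vectors, invented for illustration only (not real typological data).
features = {
    "target": [1, 0, 1, 1, 0],
    "cand_a": [1, 0, 1, 0, 0],
    "cand_b": [0, 1, 0, 1, 1],
    "cand_c": [1, 1, 1, 1, 1],
}

best = select_transfer_language("target", ["cand_a", "cand_b", "cand_c"], features)
print(best)
```

In practice the vectors would come from a typological resource rather than being written by hand, and the chosen language would then serve as the training source for zero-shot transfer to the target.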
Similar resources
Zero-Shot Learning Through Cross-Modal Transfer
This work introduces a model that can recognize objects in images even if no training data is available for the objects. The only necessary knowledge about the unseen categories comes from unsupervised large text corpora. In our zero-shot framework distributional information in language can be seen as spanning a semantic basis for understanding what objects look like. Most previous zero-shot le...
Zero-shot Cross Language Text Classifica-
Labeled text classification datasets are typically only available in a few select languages. In order to train a model for e.g. news categorization in a language Lt without a suitable text classification dataset, there are two options. The first option is to create a new labeled dataset by hand, and the second option is to transfer label information from an existing labeled dataset in a source la...
Image-Mediated Learning for Zero-Shot Cross-Lingual Document Retrieval
We propose an image-mediated learning approach for cross-lingual document retrieval where no or only a few parallel corpora are available. Using the images in image-text documents of each language as the hub, we derive a common semantic subspace bridging two languages by means of generalized canonical correlation analysis. For the purpose of evaluation, we create and release a new document data...
Zero-resource Dependency Parsing: Boosting Delexicalized Cross-lingual Transfer with Linguistic Knowledge
This paper studies cross-lingual transfer for dependency parsing, focusing on very low-resource settings where delexicalized transfer is the only fully automatic option. We show how to boost parsing performance by rewriting the source sentences so as to better match the linguistic regularities of the target language. We contrast a data-driven approach with an approach relying on linguistically ...
SitNet: Discrete Similarity Transfer Network for Zero-shot Hashing
Hashing has been widely utilized for fast image retrieval recently. With semantic information as supervision, hashing approaches perform much better, especially when combined with deep convolutional neural networks (CNNs). However, in practice, new concepts emerge every day, making it infeasible to collect supervised information for re-training the hashing model. In this paper, we propose a novel zero-shot...
Journal
Journal title: Information Processing and Management
Year: 2023
ISSN: 0306-4573, 1873-5371
DOI: https://doi.org/10.1016/j.ipm.2022.103250